Hoping to spur dramatic school turnaround, the federal government channeled
resources to the country’s lowest-performing schools through School
Improvement Grants (SIG). However, prior research on SIG effectiveness is 
limited and focuses primarily on student achievement. This study uses a 
difference-in-differences strategy to estimate program impacts on multiple 
dimensions across the 3-year duration of the SIG award in one urban school 
district. Following 2 years of modest improvement, we find pronounced, pos- 
itive effects of SIG interventions on student achievement in Year 3, consistent 
with prior literature indicating that improvements from comprehensive 
school turnarounds emerge gradually. We also identify improvements indi- 
cating the process through which change occurred, including reduced unex- 
cused absences, increased family preference for SIG schools, improved 
retention of effective teachers, and greater development of teacher profes- 
sional capacity. 


Min Sun is an assistant professor in education policy at the College of Education at the 
University of Washington, 2012 Skagit Lane, M205 Miller Hall (Box 353600), Seattle, 
WA 98195; misun@uw.edu. Her research focuses on educator quality, school 
accountability, and school improvement. 


Emily K. Penner is an assistant professor in the School of Education at the University of
California, Irvine. Her research focuses on educational inequality and policy and con- 
siders the ways that districts, schools, teachers, peers, and parents can contribute to 
or ameliorate educational inequality. 


Susanna Loeb is the Barnett Family Professor of Education at Stanford University. She
specializes in education policy with a focus on school governance and finance and 
educator labor markets. 


Sun et al. 


Keywords: School Improvement Grants, school turnaround, school performance,
school capacity


Teachers in The Zone continue to work together to develop effective 
and engaging instructional practices. Principals, instructional coaches 
and school support teams provide strong leadership focused on con- 
tinuous improvement. In addition, the adoption of a community- 
schools approach provides for enhanced student supports and 
aligned community partnerships. This combination of essential 
school supports is resulting in significantly improved outcomes for 
students. 
—Guadalupe Guerrero 
Deputy Superintendent for Instruction, 
Innovation, and Social Justice (SFUSD, 2012) 


School Improvement Grants (SIGs) were part of a broader package of targeted
federal initiatives intended to spur state and local school improvement
introduced by the former Secretary of Education, Arne Duncan (e.g.,
Race to the Top, No Child Left Behind [NCLB] priority schools). In an effort 
to incentivize dramatic school transformations, Congress appropriated $3.5 
billion for the first wave of SIGs through the American Recovery and 
Reinvestment Act (ARRA) to support states’ “persistently lowest achieving” 
(PLA) schools (U.S. Chamber of Commerce Foundation, 2010; U.S.
Department of Education, 2010a; U.S. Department of Education, 2010b).
The U.S. Department of Education awarded California, which had the largest 
number of PLA schools in the country, nearly $416 million in SIG funds. The 
San Francisco Unified School District (SFUSD) received $45 million of this 
SIG funding to transform its 10 PLA schools between the academic years 
2011 and 2013 (hereafter, we use the spring to refer to the academic year;
e.g., 2010-2011 as 2011). SIG funding doubled these schools’ budgets during 
the grant period (Wentworth, Khanna, & Piper, 2016). 

SFUSD’s SIG schools serve as examples of the potential effectiveness of 
the SIG program because of SFUSD’s concerted efforts to implement the 
reforms using evidence-based guidelines. SFUSD designed its SIG reform 
plans using the five “essential supports” from the comprehensive school 
reform guidelines drawn from improvements in student-learning outcomes 
in Chicago Public Schools (Bryk, Sebring, Allensworth, Easton, & Luppescu,
2010). Moreover, SFUSD created “the Superintendent's Zone,” an administra- 
tive structure aimed at providing administrative and curricular support to SIG 
schools to promote the successful implementation of SIG reforms (Wentworth 
et al., 2016). Its evidence-based, comprehensive school improvement framework
and focus on quality implementation make SFUSD a useful site for
assessing the effects of SIG reforms in an urban district that attempts to use 
best practices to reform its most struggling schools.

In spite of the considerable resources marshaled for SIGs and the high
expectations that SIG awards would produce substantial improvements in 
chronically underperforming schools, research demonstrating the effective- 
ness of SIGs and other whole-school reforms is inconsistent. A comprehen- 
sive report summarizing the research on whole-school reform efforts finds 
limited high-quality evidence assessing their effectiveness (Herman et al., 
2008). To date, the majority of the existing work on SIGs is descriptive in 
nature and focuses on implementation (Council of the Great City Schools, 
2015; Lachlan-Haché, Naik, & Casserly, 2012; Scott & McMurrer, 2015). A 
small number of recent studies that have estimated the causal impacts of 
SIG reforms on student outcomes either only gauge the effects after just 
the first year of the SIG award (Dee, 2012; Dickey-Griffith, 2013) or only 
focus on a limited set of academic outcome measures—mainly student test 
scores (e.g., de la Torre et al., 2013; Papay, 2015; Player & Katz, 2013). 

In this study, we use nearly a decade of longitudinal data to examine SIG- 
program impacts across the full 3-year grant duration in SFUSD. Following 
gradual improvements in the first 2 years of reform, we find pronounced, pos- 
itive effects of SIG interventions on student achievement in the third year. This 
pattern is consistent with the hypothesis that comprehensive school turn- 
arounds need time for positive changes to occur in schools. We find evidence 
of the process of these changes, including the development of “essential
supports” for organizing for school improvement identified by Bryk and
colleagues (2010). Our analyses show a reduction in unexcused student
absences in SIG schools. Families, particularly those with high-achieving stu- 
dents and from higher socioeconomic backgrounds, demonstrated increased 
preferences for SIG schools. SIG schools became better able to retain effective 
teachers and provide them with professional supports. These additional out- 
comes allow us to not only estimate temporal changes but also to examine 
the longer run effects of the SIG supports on these historically low-performing 
schools. To our knowledge, this is the first study that estimates SIG program 
impacts on student achievement across the 3-year period of the grant, incor- 
porates multiple measures of SIG impacts on lowest performing schools, and 
attempts to uncover the mechanisms of change via staff capacity building. 


Background 
Research on Whole-School Reform Efforts 


School reformers have promoted a variety of strategies to remedy under- 
performance in American schools (Coleman et al., 1966; Kantor & Lowe, 
1995; National Commission on Excellence in Education, 1983). Several dec- 
ades of whole-school reform efforts sought to spark improvements by mod- 
ifying or restructuring struggling schools. In the 1980s and 1990s, 
Schoolwide Programs (SWPs) gave schools flexibility to reduce class size,
hire staff, expand professional development offerings, increase teacher and
parent involvement in decision making, and change classroom instruction 
(Wong & Meyer, 1998). Evaluations of the effects of these programs were 
minimal, limited by the use of small, nonrandom samples of participating 
schools, and often lacked causal rigor (Sunderman, 2001; Wang, Wong, &
Kim, 1999; Wong & Meyer, 1998). Pushing for more dramatic improvement, 
Congress enacted the Comprehensive School Reform (CSR) Demonstration 
program in 1997, intending to improve curriculum, instruction, organization, 
professional development, and parental involvement (Desimone, 2002). 
Evaluations of CSR programs found mixed impacts (Bifulco, Duncombe, & 
Yinger, 2005; Bloom, Ham, Melton, & O’Brien, 2001; Cook et al., 1999; 
Gross, Booker, & Goldhaber, 2009) and wide variation in implementation 
and comprehensiveness of implementation (Aladjem et al., 2006; Berends, 
2000; Berends, Bodilly, & Kirby, 2002; Rowan & Miller, 2007). G. D. 
Borman et al.’s (2003) meta-analysis of studies of 29 CSR programs showed 
positive impacts of several CSR reforms, with the largest effects resulting 
from reforms that were implemented for the longest amount of time (i.e.,
5 years or more). Title I of NCLB also funded turnaround efforts, prescribing
dramatic restructuring in hopes of improving student achievement and 
attainment. Under NCLB, schools that failed to meet adequate yearly progress
goals for multiple years in a row were closed and restructured. Ahn and 
Vigdor (2014) found that the threat of closure and leadership change 
improved student test score performance for schools first entering the 
NCLB sanction regime, but schools under threat of weaker consequences 
showed no evidence of improvement. 

Seeking to draw lessons from the mixed track record across multiple 
waves of whole-school reforms, researchers worked to identify best practi- 
ces and develop a theory of action to guide restructuring schools. These 
guidelines highlighted the importance of capacity building among school 
and district leaders and teachers, garnering faculty and parent support and 
commitment through relationship-building, implementing strategies from 
research-based plans, giving greater flexibility to adapt reform and financial 
resources to specific contexts, and making visible improvements early on in 
the turnaround process (K. M. Borman, Carter, Aladjem, & LeFloch, 2004; 
Herman et al., 2008; Hess, 1999; Malen & Rice, 2004; Mintrop & Trujillo, 
2005; Spillane & Thompson, 1997). In addition, Bryk et al. (2010) generated 
conclusions from fieldwork in Chicago to create a “theory of practice” 
around school transformations that echoed many of the conclusions drawn 
from other whole-school reform efforts. 


Research on School Improvement Grants 


In spite of multiple policy efforts to spur changes in low-performing
schools, many schools continued to struggle. The Obama Administration
drew national focus to chronically underperforming schools by targeting SIG
funds to PLA schools, making this policy a marquee component of the 
American Recovery and Reinvestment Act in 2009. PLA schools were defined 
as schools that were eligible for Title I assistance with baseline achievement 
in the lowest 5% (based on 3-year average proficiency rates) and that had 
made the least progress in raising student achievement over the previous 
5 years. To receive funding, SIG schools were required to adopt one of 
four intervention models beginning in 2011 (U.S. Department of 
Education, 2010a). The transformation model required replacing the princi- 
pal, implementing curricular reform, introducing teacher evaluations based 
in part on student performance, and incorporating evaluation results into 
personnel decisions (e.g., rewards, promotions, retentions, and firing). 
The turnaround model included all of the requirements of the transforma- 
tion model, as well as replacing at least 50% of the staff. The restart model 
required the school to close and reopen under the leadership of a charter 
or education management organization. Finally, the closure model simply 
closed the school. 

Early research examining SIGs is primarily descriptive, providing progress 
reports focused on implementation (Council of the Great City Schools, 2015; 
Lachlan-Haché et al., 2012; Scott, Krasnoff, Davis, & Northwest, 2014; Scott & 
McMurrer, 2015). Evidence of SIG impacts on student outcomes is emerging. 
Two studies—Dee (2012) and Dickey-Griffith (2013)—examine first-year 
impacts. Dee (2012) uses a “fuzzy” regression discontinuity design based on 
two school-level eligibility thresholds (“lowest achieving” and “lack of
progress”) and finds significant improvement in posttreatment performance in
schools whose baseline proficiency rate just met the lowest achieving thresh- 
old but not among schools on the “lack of progress” margin. Dee also finds 
some evidence that SIG awards contribute to reductions in suspensions and 
truancy rates, but primarily among “turnaround” schools, which undergo 
more dramatic staff and principal replacement than other models. In contrast, 
Dickey-Griffith (2013) uses a difference-in-differences approach to assess 
1-year impacts in Texas and finds mixed results, including negative impacts 
on student achievement in elementary and middle school and positive effects 
on high school graduation rates. 

Recent work also examines SIG impacts beyond the first year, again providing
mixed evidence of SIG effectiveness. Papay (2015) finds large, positive
effects on math and English language arts (ELA) scores of being
identified as SIG-eligible, which grow from the first to the third year of 
implementation in Massachusetts. A new report from the U.S. Department 
of Education uses data from 22 states and finds a null impact on test scores, 
high school graduation, and college enrollment for the cohort of schools 
funded in 2010 (Dragoset et al., 2017). A possible explanation for the difference
in findings across studies is the variation in the design and implementation
of SIG interventions across districts and states. Another possible
explanation is that the heterogeneous results may stem from sample selection
and estimation strategies, as illustrated in Henry and Guthrie’s (2016)
work in North Carolina.

Several large urban districts embedded SIG schools within other reform 
approaches, particularly “portfolio models” (Hill, 2006). For example, Los 
Angeles Unified School District (LAUSD) implemented a Public School 
Choice Initiative (PSCI), in which stakeholders compete to turn around the
district’s lowest performing “focus” schools. Initial research finds that reform 
plans were only sometimes associated with reported implementation and 
had inconsistent effects on student achievement across three rounds of 
PSCI-driven turnarounds (Strunk, Marsh, Bush-Mecenas, & Duque, 2015; 
Strunk, Marsh, Hashim, & Bush-Mecenas, 2016; Strunk, Marsh, Hashim, 
Bush-Mecenas, & Weinstein, 2016). In contrast, research examining New 
Orleans portfolio district reforms indicates positive effects on both student 
achievement and behavior (Barrett & Harris, 2015; Harris & Larsen, 2016; 
McEachin, Welsh, & Brewer, 2016; Welsh, Duque, & McEachin, 2016). 

A companion initiative under the Obama administration, Race to the Top 
(RttT), funded similar, highly prescribed, school turnaround strategies.
Evidence from several states that won RttT funding provides mixed evidence 
of effectiveness. Heissel and Ladd (2016) find negative effects of the program 
in North Carolina, and Zimmer, Henry, and Kho (2015) find some positive 
effects in Tennessee, particularly among Innovation Zone schools that 
were managed by school districts. 


SIG and School Turnaround Models in SFUSD 


As districts across the country drafted SIG proposals, SFUSD’s central 
office prepared an application and conducted a needs assessment, examin- 
ing the challenges and priority needs of each of the 10 SIG-eligible schools. 
The needs assessment indicated the 10 schools had incoherent curricula, 
assessments, and instructional guidance; insufficient resources and class- 
room materials; a lack of comprehensive interventions and monitoring of 
student progress; and haphazard implementation of improvement strategies 
that rarely lasted beyond a few years. SIG-eligible schools lacked resources 
to engage with families and did not comprehensively meet the needs of the 
community. Principals in some of the schools lacked the instructional leader- 
ship needed to dramatically improve student performance. In addition, the 
secondary schools experienced low engagement and high truancy 
(SFUSD, 2010). 

In response, a district committee created a joint application for the 10 
schools, first identifying the reform model each school would adopt. 
Because SFUSD had more than nine SIG-eligible schools, it could only use 
the transformation model in up to half of the eligible schools (Norton, 
2010). During this process, district leaders sought strategic input from
stakeholders and held school-site discussions and community meetings at
each school; five schools chose the transformation model, four schools 
chose turnaround, and the lowest performing school chose closure 
(California Department of Education, 2010). 

Based on the needs assessment, the district adopted the five “essential 
supports” from Bryk et al.’s (2010) “Organizing Schools for Improvement: 
Lessons from Chicago” to develop a coordinated effort for school improvement
in these nine schools. This plan highlights the ways in which each support
fits within the required components of the SIG application; the supports
are summarized below.


• Activating school leadership as the driver for change: In addition to removing
principals who had been at a SIG school for more than 2 years and providing
new principals with more flexibility over hiring, SFUSD redesigned the ways in
which the district central office provided support to schools. The SIG schools
were organized into two zones (Bayview and Mission) with corresponding
district resources to strengthen management and provide continuous support
and mentoring for school personnel (SFUSD, 2010; Wentworth et al., 2016).

• Developing professional capacity among teachers: SFUSD provided job-embedded
teacher professional development featuring one-on-one coaching.
Moreover, SFUSD instituted a performance management system using common
interim assessments and other evidence of student learning to improve teaching
practice (SFUSD, 2012).

• Cultivating cohesive instructional guidance that promotes ambitious academic
achievement for every child: SIG schools were required to implement
a Common Core curriculum that clearly specified what students should know
and be able to do and set high standards for rigor and instructional quality.
The schools also administered common interim assessments that tracked students’
progress in meeting the standards. The schools partnered with third parties
(e.g., Teachers College, WRITE Institute, Algebraic Thinking & The Algebra
Project, Project SEED, Tools for Schools, etc.) to focus on improving math and
literacy instruction (SFUSD, 2010).

• Nurturing a student-centered learning climate: SIG schools extended learning
time for students both after school and during the summer and implemented an
early-warning monitoring system of student progress. In addition, secondary
SIG schools promoted a college-going culture (SFUSD, 2010).

• Fostering parent-community ties: All SIG schools implemented a community-school
approach beyond parent workshops that built family and community
involvement and outreach (SFUSD, 2010).


SFUSD’s application was successful. It received nearly all of the $45 million 
it requested. The district used the SIG money to implement the reforms outlined 
in the proposal, continuing to adopt the language of the Bryk et al. (2010) comprehensive
reforms. Although the staffing transitions were more comprehensive
at the turnaround schools than the transformation schools, all nine used the 
guidelines outlined in the application to structure their reforms.

The theory of change behind SFUSD’s reforms is based on the belief that
an incremental change is not sufficient to reform the “dysfunctional organi- 
zations” in low-performing schools and that dramatic restructuring of staff, 
curriculum, and environment is necessary (Malen & Rice, in press). This theory
is similar to that articulated by Strunk, Marsh, Hashim, and Bush-Mecenas
(2016), who suggest that districts rely both on incentives to improve
the productivity of staff in reconstituted schools, specifically the threat of 
additional reconstitution, and on school capacity reinforcements, including 
new staff, additional training, and funding, to support reforms. Research 
on multiple waves of whole-school reforms also demonstrates the need to 
build school capacity in order to yield systematic and sustained positive 
changes (K. M. Borman et al., 2004; Herman et al., 2008; Hess, 1999; 
Malen & Rice, 2004; Mintrop & Trujillo, 2005; Spillane & Thompson, 1997). 
This capacity-building strategy often requires time for schools to implement 
reconfiguration and install supports for teaching and learning. 

Because SFUSD chose to model its reforms on Bryk et al.’s (2010) five- 
essential supports model, we assess impacts in several areas that might be 
indicative of such targeted reform efforts. First, we evaluate whether SIG 
schools progressed toward nurturing student-centered learning climates, 
coherent instructional guidance, and improved parent-community ties by 
examining changes in student achievement, attendance, and parent prefer- 
ences for school placement. Developing a student-centered learning climate 
should result in improved student performance and increased attendance 
(Harris & Larsen, 2016; Jackson, 2012; McEachin et al., 2016). Improved 
parent-community ties and increased curricular rigor should make SIG 
schools more popular among families rather than stigmatize these schools 
as undesirable (Heissel & Ladd, 2016; Welsh et al., 2016), as revealed by pref- 
erences in school placement processes (Hastings, Kane, & Staiger, 2005; 
Hastings & Weinstein, 2008). In addition, changes in leadership, instructional 
guidance, and teacher professional capacity should be evident in improved 
retention of skilled teachers and in teacher-reported working environment, 
collaboration, administrative support, and mentorship (Barrett & Harris, 
2015; Strunk, Marsh, Hashim, & Bush-Mecenas, 2016). We examine changes 
to the teacher workforce on measures of effectiveness and use survey results 
that include self-reports about teacher professional support, collaboration, 
and mentoring from school leadership. 

The approaches embedded in SFUSD’s SIG application and subsequent 
reforms feature prominently in current policy prescriptions for improving 
struggling schools under ESSA. Yet evidence supporting their effectiveness 
is limited to a sparse set of outcome measures that do little to illuminate 
the mechanisms driving change. This study seeks to address this limitation 
by providing a multiyear, in-depth evaluation of SIG reforms. It examines 
not only student outcome measures but also several indicators of organizational
change that speak to the process through which SIG schools
conducted their turnaround. This study provides the most thorough evaluation
of SIG reforms to date.


Data and Methods 


The data used in this study come from SFUSD. In the 2014-2015 school 
year, SFUSD was California’s sixth largest district, serving approximately 
58,000 students (California Department of Education, 2015). SFUSD’s student 
body is both racially and socioeconomically diverse: 26% of its students 
identify as Latino, 41% as Asian, 11% as White, 10% as African American, 
1% as Native American, and 10% as other. Twenty-seven percent speak 
English as a second language, and 61% are eligible for free or reduced-price 
lunch. SFUSD employs over 3,500 teachers to serve this student body. 
SFUSD’s teaching force is also more diverse than the national average: 
53% of teachers identify as educators of color compared with 18% of public 
school teachers nationwide (National Center for Education Statistics, 2015). 
SFUSD teachers have 11 years of teaching experience on average, which 
is slightly less than the national average of 14 years (National Center for 
Education Statistics, 2015; SFUSD, 2015, 2016). SFUSD schools demonstrate 
a substantial amount of performance heterogeneity, including both gold rib- 
bon schools, acknowledged for outstanding and innovative performance, 
and SIG schools, identified as within the bottom 5% of the persistently low- 
est performing across the state in the same year. 

Our analyses use SFUSD administrative data on students, teachers, and 
their schools from 2005 to 2013. We supplement the administrative data 
with 4 years of personnel survey data from 2010 to 2013. We exclude the 
one closure school from the analysis because the SIG award to this school 
was mainly used to facilitate students’ transitions to new schools at the 
end of spring 2011 rather than invested in improving school capacity to raise 
students’ learning outcomes. The SIG schools in our analysis sample, thus, 
include the five transformation and four turnaround schools. 


Analytic Samples 


Because of concerns regarding parent responses to SIG reforms that 
might motivate them to transfer their student in or out of SIG schools in 
response to the reform efforts, we estimate SIG effects using two 
approaches. The first approach compares student outcomes during the grant 
period (i.e., from 2011 through 2013) between those who were in SIG
schools in fall 2010 (i.e., right at the beginning of the reform) and those
who were in non-SIG schools at the same time, regardless of whether 
they transferred out of these schools in subsequent years. Hereafter, we 
call this group the “all starters.” The estimate of SIG effects on this sample 
is analogous to what is called an intent-to-treat (ITT) effect in the experimental
research literature, in that it represents the average effects for students
who were in their assigned “treatment” conditions (either SIG or non-SIG)
prior to the implementation of the intervention. The analysis sample
does not include cohorts of students who newly enrolled in the schools in 
2011 and 2012 after the intervention started, because parents might select 
or avoid SIG schools due to the SIG awards. Inclusion of these new cohorts 
may introduce bias in the estimation of SIG treatment effects. 

While the “all starters” sample most cleanly removes issues of selection 
based on SIG assignment from the estimation, it may not accurately estimate 
the SIG effect because many of the “all starters” do move from the SIG 
schools and thus are not subject to the advantages or disadvantages of the 
SIG intervention. Our second approach further limits the sample to include 
only students who were in the same schools for at least 1 year prior to and 1 
year after fall 2010 and did not transfer between schools during the interven- 
tion period. We call this group “stayers,” which is somewhat analogous to 
the traditional “treatment-on-treated” sample in the experimental research 
literature. The estimate of SIG effects from this sample covers only those stu- 
dents who actually received the treatment of attending a SIG school for at 
least 1 year. Results are largely consistent between these two samples. We 
focus on the “all starters” sample in the main text, which provides the 
more conservative estimates, and include all corresponding results for the 
“stayers” sample in the online appendix. 
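
The two sample definitions above can be sketched in code. This is only an illustration under stated assumptions, not the authors' procedure: the enrollment layout, the school IDs, and the function name `build_samples` are hypothetical, and years follow the spring convention, so fall 2010 corresponds to year 2011. "Stayers" are read here as students enrolled in the same school in the year before the base year and in every grant year in which they are observed.

```python
def build_samples(enrollment, sig_schools, base_year=2011,
                  grant_years=(2011, 2012, 2013)):
    """Return (all_starters, stayers) as {student_id: started_in_sig_school}.

    enrollment: {student_id: {spring_year: school_id}} (hypothetical layout).
    """
    all_starters, stayers = {}, {}
    for student_id, by_year in enrollment.items():
        base_school = by_year.get(base_year)
        if base_school is None:
            continue  # not enrolled in fall 2010, so excluded from both samples
        treated = base_school in sig_schools
        # "All starters": classified by the fall 2010 school, whatever happens later.
        all_starters[student_id] = treated
        # "Stayers" (assumed reading): same school the year before, no later transfer.
        same_before = by_year.get(base_year - 1) == base_school
        observed = [by_year[y] for y in grant_years if y in by_year]
        if same_before and all(school == base_school for school in observed):
            stayers[student_id] = treated
    return all_starters, stayers


# Made-up enrollment histories; schools "A" and "B" stand in for SIG schools.
toy_enrollment = {
    "s1": {2010: "A", 2011: "A", 2012: "A", 2013: "A"},  # SIG starter and stayer
    "s2": {2010: "A", 2011: "A", 2012: "C"},             # SIG starter who transfers out
    "s3": {2010: "C", 2011: "C", 2012: "C", 2013: "C"},  # non-SIG starter and stayer
    "s4": {2012: "A", 2013: "A"},                        # enters after the reform began
    "s5": {2010: "C", 2011: "B", 2012: "B", 2013: "B"},  # new to a SIG school in fall 2010
}
all_starters, stayers = build_samples(toy_enrollment, {"A", "B"})
```

In this toy data, student s2 stays in the "all starters" comparison as a SIG student despite leaving, which is what makes that estimate behave like an intent-to-treat effect.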

Table 1 summarizes the descriptive comparisons of baseline student 
attributes in 2010 between SIG and non-SIG schools and between turn- 
around and transformation SIG schools for the two samples described 
above. Almost all of the observed preintervention student characteristics dif- 
fer significantly between SIG and non-SIG schools. For example, SIG schools 
served students who were lower performing, had more disciplinary issues as 
indicated by the days of unexcused absences and suspensions, were more 
likely to be minorities, to be English language learners (ELLs), and to 
come from socioeconomically disadvantaged families. Similarly, among 
SIG schools, turnaround schools served lower performing, higher minority, 
and more socioeconomically disadvantaged students in 2010 than did trans- 
formation schools. 

We describe changes in student composition prior to and during each year 
of the reform in online appendix Table B1. SIG schools kept students with 
higher prereform average math and ELA scores in postreform years in the 
“stayers” sample, underscoring the importance of controlling for prereform dif- 
ferences in student characteristics and achievement in our estimation of SIG 
impacts. These controls account for peer changes and sample selection.


Analytical Approaches by Types of Outcome Measures 


To examine a broad spectrum of SIG impacts in response to reforms
grounded in the “essential supports” outlined by Bryk and colleagues
(2010), we examine student attendance, student achievement, family preferences
for SIG schools, teacher retention based on effectiveness and seniority,
and teacher support. While these are not by any means an exhaustive set of
indicators of these changes, our outcome measures do provide several
pieces of evidence indicating the extent to which changes aligned with
the reform’s theory of action. Due to the varied nature of the outcome measures,
we employ several analytic strategies and functional forms, which we
describe in detail below.


Table 1
Comparisons of Student Characteristics for “All Starters”

                                                SIG             Non-SIG         Transformation  Turnaround
Math standardized test scores*                  -0.69 (0.70)    0.11 (1.00)     -0.65 (0.70)    -0.80 (0.71)
ELA standardized test scores*                   -0.62 (0.82)    0.10 (0.99)     -0.56 (0.81)    -0.75 (0.83)
Days of excused absences                        5.81 (8.33)     4.31 (?.96)     5.60 (8.13)     6.24 (8.72)
Days of unexcused absences*                     11.93 (15.56)   6.74 (13.37)    12.43 (15.95)   10.89 (14.68)
Days suspended*                                 0.19 (1.04)     0.08 (0.68)     0.16 (0.91)     0.27 (1.25)
Race: White                                     0.02 (0.15)     0.11 (0.31)     0.03 (0.16)     0.02 (0.14)
      African American*                         0.18 (0.38)     0.09 (0.29)     0.12 (0.33)     0.30 (0.46)
      Hispanic*                                 0.61 (0.49)     0.21 (0.41)     0.65 (0.48)     0.54 (0.50)
      Asian*                                    0.12 (0.33)     0.51 (0.50)     0.16 (0.36)     0.06 (0.23)
      Other*                                    0.06 (0.24)     0.07 (0.26)     0.05 (0.22)     0.08 (0.28)
Students in special education programs          0.14 (0.34)     0.11 (0.31)     0.14 (0.35)     0.13 (0.34)
English language learners*                      0.46 (0.50)     0.28 (0.45)     0.48 (0.50)     0.42 (0.49)
Log of neighborhood median household income*    10.95 (0.48)    11.10 (0.45)    11.02 (0.43)    10.81 (0.53)
N (students)                                    2,644           37,094          1,782           862

Note. Means are reported with standard deviations in parentheses.
*Significant differences in means between turnaround and transformation schools.


Student Achievement 


Because restricting analysis to either the “all starters” or “stayers” sample 
leads to a small number of students with test scores prior to 2008, the main 
analysis on student achievement uses 6 years of data from 2008 through 
2013. We define whether a student was in a SIG school or a non-SIG school 
by his or her school attendance in the year before implementation (i.e., fall
2010). We then test whether students who initially attended SIG schools
showed higher achievement over the subsequent 3 years than students
who initially attended non-SIG schools, relative to prereform differences
between SIG and non-SIG schools, controlling for their preintervention
characteristics. The logic is similar to a traditional difference-in-differences
(DD) approach. Equation 1 describes the model.


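$$
\begin{aligned}
A_{igst} = {} & \alpha + \beta_1 \mathit{Year}_{2011} + \beta_2 \mathit{Year}_{2012} + \beta_3 \mathit{Year}_{2013} + \beta_4 (\mathit{Year}_{2011})(\mathit{SIG}_s) \\
& + \beta_5 (\mathit{Year}_{2012})(\mathit{SIG}_s) + \beta_6 (\mathit{Year}_{2013})(\mathit{SIG}_s) + X_{igst}\gamma + \omega_g + \nu_s + \varepsilon_{igst},
\end{aligned}
\tag{1}
$$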


where $A_{igst}$ is the math or ELA standardized test score of student $i$ in grade $g$,
school $s$, and year $t$ on the California Standards Tests (CST). Although the
subscript for subjects is omitted, we conduct the estimation separately for
math and ELA. $\mathit{Year}_{2011}$ is the dummy indicator for observations in
2011, the first year of SIG interventions; $\mathit{Year}_{2012}$ indicates the second
year; and $\mathit{Year}_{2013}$ indicates the third year. $\mathit{SIG}_s$ is a time-invariant
school-level indicator for the nine SIG schools. $\beta_4$, $\beta_5$, and $\beta_6$ indicate the treatment
effect estimate in each of the treatment years by contrasting the difference in
the average student achievement between the pre- and post-2010 school
years in SIG schools with the difference in the average achievement between
pre- and post-2010 in non-SIG schools.
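
To make the contrast behind $\beta_4$ through $\beta_6$ concrete, the following minimal sketch computes year-specific difference-in-differences estimates from group means. It is an illustration only, not the authors' estimation code: it omits the student controls and the grade and school fixed effects, and the function name `dd_by_year` and the toy scores are hypothetical.

```python
from statistics import mean

def dd_by_year(records, pre_years, post_years):
    """Year-specific difference-in-differences from cell means.

    records: iterable of (year, is_sig, score) tuples.
    Returns {year: (SIG change from the pre-period) - (non-SIG change)}.
    """
    records = list(records)

    def cell_mean(years, sig):
        return mean(s for (y, g, s) in records if y in years and g == sig)

    pre_sig = cell_mean(pre_years, True)
    pre_non = cell_mean(pre_years, False)
    effects = {}
    for year in post_years:
        sig_change = cell_mean({year}, True) - pre_sig
        non_change = cell_mean({year}, False) - pre_non
        effects[year] = sig_change - non_change
    return effects

# Hypothetical toy data (not from the study): standardized scores.
toy = [
    (2010, True, -0.70), (2010, True, -0.60),
    (2010, False, 0.10), (2010, False, 0.20),
    (2011, True, -0.55), (2011, True, -0.45),
    (2011, False, 0.15), (2011, False, 0.25),
]
effects = dd_by_year(toy, pre_years={2010}, post_years=[2011])
print(round(effects[2011], 2))
```

In this toy case the SIG group gains 0.15 SD and the comparison group gains 0.05 SD, so the contrast is 0.10 SD. The regression in Equation 1 estimates the same kind of contrast while also adjusting for covariates and fixed effects.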

Students were not randomly assigned to schools before the intervention. 
To account for student selection bias, student controls, $X_{igst}$, are added to the
model, including students’ race and ethnicity (Black, Hispanic, Asian, 
others), gender, ELL and disability designations, and whether either parent 
has a BA degree or higher. Instead of using lunch subsidies as a proxy for 
student socioeconomic background, we use several measures of student 
neighborhood socioeconomic status via their geocoded home addresses. 
By linking students’ geo-coded addresses with the U.S. Census Bureau 
American Community Survey (ACS) data, we obtained the 5-year character- 
istics of neighborhoods (2007-2012) where the students lived, including the 
log of median household income, percentage with a bachelor’s degree or 
higher among residents who are 25 and older, percentage of residents 18 
or under living below the poverty threshold, and the log of median housing 
value (owner occupied). We include students’ average achievement in the 
subject area prior to 2010 in order to further account for their preintervention 
differences. A second specification controls for students’ prior-year test 
scores to capture SIG effects on the year-to-year student improvement, 
rather than controlling for students’ average achievement prior to 2010. 
Although this second model may underestimate the treatment effects since 
it adjusts for a score that is likely a function of the treatment, it has the poten- 
tial advantage of absorbing more of the differences between students in the 
SIG and non-SIG schools. Both models also include grade fixed effects, $\omega_g$,
to account for differences in academic tests across grades, and school fixed
effects, $\nu_s$, to control for time-invariant heterogeneity across schools. $\varepsilon_{igst}$ is
the error term. Because SIG strategies are whole-school reform efforts, we
estimate cluster-robust standard errors at the school level to adjust for
correlations within schools and the influence of the small number of treatment
clusters on standard error estimates.⁷

We then use a similar strategy to estimate potential differential effects of 
transformation and turnaround models: 


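$$
\begin{aligned}
A_{igst} = {} & \alpha + \beta_1 \mathit{Year}_{2011} + \beta_2 \mathit{Year}_{2012} + \beta_3 \mathit{Year}_{2013} \\
& + \beta_4 (\mathit{Year}_{2011})(\mathit{Transformation}_s) + \beta_5 (\mathit{Year}_{2012})(\mathit{Transformation}_s) + \beta_6 (\mathit{Year}_{2013})(\mathit{Transformation}_s) \\
& + \beta_7 (\mathit{Year}_{2011})(\mathit{Turnaround}_s) + \beta_8 (\mathit{Year}_{2012})(\mathit{Turnaround}_s) + \beta_9 (\mathit{Year}_{2013})(\mathit{Turnaround}_s) \\
& + X_{igst}\gamma + \omega_g + \nu_s + \varepsilon_{igst},
\end{aligned}
\tag{2}
$$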


where $\mathit{Transformation}_s$ is a time-invariant school-level indicator for schools
that implemented the transformation model, and $\mathit{Turnaround}_s$ indicates
schools that chose the turnaround model. $\beta_4$, $\beta_5$, and $\beta_6$ estimate SIG effects
in transformation schools (relative to non-SIG schools) separately by each
intervention year, while $\beta_7$, $\beta_8$, and $\beta_9$ capture SIG effects in turnaround
schools (relative to non-SIG schools). The remaining terms are consistent with
those in Equation 1. The numbers of transformation and turnaround
schools are small and, as a result, we treat these estimates with caution.


Student Absences 


We use data from 2008 through 2013 to estimate the effects of SIG 
reforms on student absences. Because absences are a relatively rare occur- 
rence for most students, we use negative binomial models to estimate the 
count of students’ full-day absences as a function of SIG policy treatment 
and student and family characteristics.⁸ The identification strategy is the
same as illustrated in Equation 1, estimating SIG effects using the compari- 
son of the change in the average probability of student absences before 
and after the reform, between SIG and non-SIG schools. Beyond including 
student controls that are used to model student achievement, when model- 
ing absences, we also control for the distance from a student’s home to his or 
her school to account for absences due to transportation difficulties. 

We estimate effects separately for excused and unexcused absences. A 
full-day absence was recorded for pupils who were absent for more than 
84% of the regularly scheduled school day. The State of California 
Education Code 48205 states that a legitimate excused absence has to be ini- 
tiated by parents or legal guardians. Excused absences can be due to student 
illness, medical appointment, or justifiable personal reasons, including an 
appearance in court, attendance at a funeral service, religious holiday or
ceremony, or a visit to a college or university.⁹ Because of the relatively
restrictive rules for legitimate excused absences, we anticipate less variation in
excused absences across schools and over time, as well as less change within 
schools following the implementation of SIG reforms. In contrast, we expect 
unexcused absences to align more closely to student and parent school 
engagement, as well as monitoring systems of student progress, all of which 
are the targets of SIG reforms. 


Family Preferences 


As an indication of community and parent responses to SIG improve- 
ment efforts, we use student-family school choice on enrollment preference 
forms. SFUSD uses a Student Assignment System, which has been in place 
since 2003, to assign all students to all of its schools through a choice process 
designed to provide equitable access to the range of opportunities available 
in San Francisco’s public schools. This process is described in greater detail 
in online appendix C. The large majority of students submit choice forms 
when they enter kindergarten, sixth, and ninth grades, when they initially 
enter the district, or if they want to transfer schools (ranging from 60% to 
70% in the early 2000s to about 90% in more recent years). 

Our choice analysis restricts the sample to all students who could have 
chosen the SIG schools over 9 years from 2005 through 2013. This includes 
all students who applied for the grade level in which a school receives a new 
cohort of students. For example, all kindergarten applicants are the potential 
choosers of an elementary school; similarly, all sixth-grade applicants, of 
a middle school; and all ninth-grade applicants, of a high school. 
Although many students listed more than one choice, we model students’ 
first choices, because these schools are families’ most desired choice.¹⁰

The identification strategy of comparing top preferences for SIG schools 
and non-SIG schools would not be appropriate for analyzing choices, because 
the increase in desirability of SIG schools would, by default, result in a decrease 
in non-SIG schools’ desirability. In other words, the change in non-SIG schools’ 
trends is dependent on the change in SIG schools’ trends, and vice versa. This 
interdependency would violate the common-trends assumption of a difference-
in-differences approach. Instead, we use an interrupted time series (ITS)
approach to identify the postintervention deviations from the preintervention
trend in student choice of the SIG schools. We model the likelihood that a
student chooses a SIG school as his or her first choice ($y = 1$) as a function of
postintervention duration ($\mathit{Year}_{2011}$, $\mathit{Year}_{2012}$, and $\mathit{Year}_{2013}$), controlling for the year
trend ($\mathit{Year}$), the same student characteristics as in Equation 1 and the proximity
of his or her home to the school in the vector $X_{it}$, and school fixed effects
($\nu_s$) (Burgess, Greaves, Vignoles, & Wilson, 2014; Hastings & Weinstein, 2008).
A logit regression model is summarized in Equation 3, where $\mathit{Year}$ is a linear
year term, centered on the reform year of 2010, and the standard errors are
school-level cluster-robust standard errors.

$$
\log\left(\frac{p(y_{it}=1)}{1-p(y_{it}=1)}\right) = \alpha_0 + \alpha_1 \mathit{Year} + \beta_1 \mathit{Year}_{2011} + \beta_2 \mathit{Year}_{2012} + \beta_3 \mathit{Year}_{2013} + X_{it}\gamma + \nu_s + \varepsilon_{it}.
\tag{3}
$$
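
As a rough illustration of how the interrupted time series terms are built, the sketch below constructs the centered linear trend and the three postintervention indicators for a given application year and evaluates the linear predictor. The function names and all coefficient values are hypothetical, and student covariates and school fixed effects are left out for brevity.

```python
def its_terms(year, reform_year=2010):
    """Regressors for one application year in the interrupted time series.

    'trend' is the linear year term centered on the reform year; the three
    indicators flag the postintervention years 2011, 2012, and 2013.
    """
    return {
        "trend": year - reform_year,
        "year_2011": int(year == 2011),
        "year_2012": int(year == 2012),
        "year_2013": int(year == 2013),
    }

def log_odds(terms, coefs, intercept):
    """Linear predictor of the logit model, ignoring covariates and fixed effects."""
    return intercept + sum(coefs[name] * value for name, value in terms.items())

# Hypothetical coefficients chosen only for illustration (not study estimates).
coefs = {"trend": -0.05, "year_2011": 0.27, "year_2012": 0.50, "year_2013": 0.77}
for year in (2009, 2010, 2011, 2013):
    terms = its_terms(year)
    print(year, terms, round(log_odds(terms, coefs, intercept=-3.0), 2))
```

Each postintervention coefficient measures how far that year's log-odds sit above or below the value the preintervention trend alone would predict.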


Effective Teacher Retention 


SIG schools aim to disrupt retention policies based solely on seniority 
and implement a system that prioritizes the hiring and retention of effective 
teachers. Without knowing all dimensions that principals use to select teach- 
ers, we measure teacher effectiveness using an annual value-added measure, 
which gauges teachers’ contribution to raising student achievement (see 
online appendix A for details on the estimation of value-added). We average 
3 years of value-added measures in their respective subjects—the current 
year and two prior years—to create our teacher effectiveness measure. 
This measure accounts for concerns about year-to-year fluctuation of 
value-added measures due to the variation in true teacher performance 
over time and measurement error (Loeb & Candelaria, 2012).¹¹ In each
year, we have between 66 and 88 teachers in the nine SIG schools with 
value-added, which represents approximately 22% of teachers in these 
schools. 
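
The 3-year averaging described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, and the handling of teachers with fewer than 3 years of estimates is an assumption, since the text does not say how such cases were treated.

```python
def three_year_average(va_by_year, year):
    """Average a teacher's value-added over `year` and the two prior years.

    va_by_year: dict mapping school year -> annual value-added estimate.
    Assumption (not stated in the text): years with no estimate are skipped
    and the mean is taken over whatever remains; None if nothing is available.
    """
    window = [va_by_year[y] for y in (year - 2, year - 1, year) if y in va_by_year]
    return sum(window) / len(window) if window else None

# Hypothetical annual estimates for one teacher (not data from the study).
teacher_va = {2009: 0.10, 2010: -0.02, 2011: 0.16}
print(round(three_year_average(teacher_va, 2011), 2))
```

Averaging across years trades some timeliness for stability, which is the rationale the text gives for smoothing out year-to-year fluctuation.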

As shown descriptively in online appendix Table B2, SIG schools kept 
teachers with higher value-added scores during the reform period than the 
prereform year. To formalize this observation, we use a strategy, similar to 
the difference-in-difference-in-differences (DDD) framework, with a condi- 
tional logit function to examine whether the relationship between teacher 
effectiveness and retention became stronger in SIG schools relative to 
non-SIG schools in post-SIG years, compared to the pre-SIG years. 


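$$
\begin{aligned}
\log\left(\frac{p(y_{jst}=1)}{1-p(y_{jst}=1)}\right) = {} & \alpha_0 + \beta_1 \mathit{Year}_{2011} + \beta_2 \mathit{Year}_{2012} + \beta_3 \mathit{Year}_{2013} + \beta_4 (\mathit{Effectiveness})_{jst} \\
& + \beta_5 (\mathit{Year}_{2011})(\mathit{SIG}_s) + \beta_6 (\mathit{Year}_{2012})(\mathit{SIG}_s) + \beta_7 (\mathit{Year}_{2013})(\mathit{SIG}_s) \\
& + \beta_8 (\mathit{Year}_{2011})(\mathit{Effectiveness})_{jst} + \beta_9 (\mathit{Year}_{2012})(\mathit{Effectiveness})_{jst} \\
& + \beta_{10} (\mathit{Year}_{2013})(\mathit{Effectiveness})_{jst} + \beta_{11} (\mathit{SIG}_s)(\mathit{Effectiveness})_{jst} \\
& + \beta_{12} (\mathit{Year}_{2011})(\mathit{SIG}_s)(\mathit{Effectiveness})_{jst} + \beta_{13} (\mathit{Year}_{2012})(\mathit{SIG}_s)(\mathit{Effectiveness})_{jst} \\
& + \beta_{14} (\mathit{Year}_{2013})(\mathit{SIG}_s)(\mathit{Effectiveness})_{jst} + X_{jst}\gamma + \nu_s + \varepsilon_{jst},
\end{aligned}
\tag{4}
$$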


where $y_{jst}$ is the retention status of teacher $j$ in school $s$ and year $t$ (“1” = stays
in current school in the following year, excluding retirement; “0” =
otherwise). Although the subscript for subjects is omitted, we conduct the
estimation separately for math and ELA teachers. $(\mathit{Effectiveness})_{jst}$ indicates
the 3-year average teacher value-added estimates. The coefficients of the
three-way interactions, $\beta_{12}$, $\beta_{13}$, and $\beta_{14}$, indicate the SIG effects on retaining
effective teachers in Year 1, Year 2, and Year 3, respectively. $X_{jst}$ includes
teacher demographics and professional background (e.g., having a master’s
degree, majoring in education in the highest degree, and being in the first 3 years of
teaching), as well as school characteristics (percentage of White, Black,
Hispanic, or Asian students; average school-level days of suspension;
percentage of novice teachers; average of student socioeconomic characteristics).
$\nu_s$ indicates school fixed effects. Again, we used cluster-robust standard
errors at the school level. In a second model, we replace effectiveness
with being an experienced teacher (e.g., >3 years of teaching experience)
to assess whether experienced teachers became more or less likely to stay
in SIG schools during the reform period.
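
To see why the three-way interaction isolates the SIG effect on retaining effective teachers, it can help to write out the slope of the log-odds of retention with respect to effectiveness implied by Equation 4 in each school type and period. The following is a sketch for Year 1 only; Years 2 and 3 follow the same pattern with $\beta_{13}$ and $\beta_{14}$ (and $\beta_9$, $\beta_{10}$ in place of $\beta_8$).

```latex
\begin{aligned}
\text{non-SIG, prereform:} \quad & \beta_4 \\
\text{non-SIG, 2011:} \quad & \beta_4 + \beta_8 \\
\text{SIG, prereform:} \quad & \beta_4 + \beta_{11} \\
\text{SIG, 2011:} \quad & \beta_4 + \beta_8 + \beta_{11} + \beta_{12} \\[4pt]
\text{DDD:} \quad & \big[(\beta_4 + \beta_8 + \beta_{11} + \beta_{12}) - (\beta_4 + \beta_{11})\big]
                    - \big[(\beta_4 + \beta_8) - \beta_4\big] = \beta_{12}
\end{aligned}
```

Because the model is a logit, $\exp(\beta_{12})$ can be read as the ratio by which the odds of retention associated with a one-unit increase in effectiveness changed in SIG schools in Year 1, over and above the corresponding change in non-SIG schools.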


Teacher Supports 


We also investigate how well SIG schools succeeded in developing the 
professional capacity of their teachers, creating cohesive instructional
guidance, and using leadership as a driver for change, using annual teacher
survey data from 2010 to 2013.¹² We use a set of questions from the
surveys that focused specifically on teachers’ reports of the supportiveness 
of their school environments, their mentoring from school leaders, and their 
collaboration and mutual support as a teaching team. Teachers were asked, 
on a 7-point scale including never (0), once (1), twice (2), 3 or 4 times (3), 
5-9 times (7), and 10 or more times (10) within each year, about the fre- 
quency of (a) visiting another teacher’s classroom to watch him or her teach; 
(b) having a colleague observe your classroom; (c) inviting someone in to 
help your class; (d) going to a colleague to get advice about an instructional 
challenge you faced; (e) receiving useful suggestions for curriculum material 
from colleagues; (f) receiving meaningful feedback on your teaching prac- 
tice from colleagues; (g) receiving meaningful feedback on your teaching 
practice from your principal; and (h) receiving meaningful feedback on 
your teaching practice from another school leader (e.g., AP, instructional 
coach). We derive a composite measure of teacher supports by taking the 
mean across these items.¹³
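
A minimal sketch of the composite follows, using the numeric codes attached to the response options listed above. The function name, the item labels used as keys, and the treatment of unanswered items are assumptions made for illustration; the text states only that the composite is the mean across items.

```python
from statistics import mean

# Numeric codes attached to the response options listed in the text.
FREQUENCY_CODES = {
    "never": 0, "once": 1, "twice": 2,
    "3 or 4 times": 3, "5-9 times": 7, "10 or more times": 10,
}

def support_composite(responses):
    """Mean of the coded item responses for one teacher.

    responses: dict mapping item label ('a'..'h') -> response option.
    Assumption (not stated in the text): unanswered items (None) are skipped.
    """
    coded = [FREQUENCY_CODES[r] for r in responses.values() if r is not None]
    return mean(coded) if coded else None

# Hypothetical responses from one teacher to items (a) through (h).
teacher = {
    "a": "once", "b": "twice", "c": "never", "d": "5-9 times",
    "e": "3 or 4 times", "f": "twice", "g": "once", "h": None,
}
print(support_composite(teacher))
```

Because the codes approximate counts of occasions, the composite can be read loosely as the average number of times per year a teacher reported each kind of support.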

We examine changes in teacher supports from 2010 to 2013 in SIG 
schools relative to non-SIG schools using an approach similar to Equation 
1. The analysis includes responses from all teachers present in each year, 
because these survey results are intended to take the pulse of the current 
teaching climate in SIG versus non-SIG schools in both the pre-SIG and 
during-SIG periods. These models include the composite measure of teacher 
supports as the dependent variable, post-SIG year indicators, the school and
teacher controls described above, and school fixed effects, and use school- 
level robust standard errors. 


Robustness and Falsification Tests 


There are three potential threats to the causal inference of the DD 
design. First, DD designs assume that trends in SIG schools would have 
been the same as those in non-SIG schools without the reforms. We examine 
prereform trends to assess the validity of this assumption. A second concern 
is that other factors produced or contributed to any changes in SIG schools at 
the same time as the SIG reforms, which is difficult to assess. However, we 
provide some evidence of the prominence of the SIG reforms relative to any 
other concurrent factors. A third potential concern is mean-reversion, in 
which the lowest achieving schools experience larger than average gains 
in years following the SIG intervention (e.g., Ahn & Vigdor, 2014; Figlio & 
Rouse, 2006). In other words, the increase in student achievement in years 
following SIG implementation is not due to a SIG treatment effect but rather 
due to the unusually low scores prior to the intervention. Using techniques 
similar to those used by Figlio and Rouse (2006), we assess the threat of 
mean reversion in our data. 


Results 


For three of our outcome measures (achievement, attendance, and fam- 
ily preferences), we present a series of figures that graphically illustrate our 
analytic approach, followed by regression estimates in the “all starters” sam- 
ples. We then present differential SIG effects for transformation and turn- 
around schools. For our remaining outcomes (teacher turnover and 
teacher supports), we present regression estimates of SIG effects across 
the 3 years of SIG reform. 


Student Achievement 


Our analysis provides evidence that SIG interventions significantly 
increased average student achievement in math and ELA and that the treat- 
ment effect is most pronounced in the third year of the intervention. Figure 1 
compares the trends in average student achievement between SIG and non- 
SIG schools. Figure 1a shows that prior to reform, the average math score of 
the SIG “all starters” sample was −0.68 standard deviations (SD) in spring
2008 and −0.69 SD in spring 2010. The average math score of non-SIG
schools was considerably higher, 0.17 SD in spring 2008 and 0.11 SD in 
spring 2010, resulting in a significant 0.80 SD gap in average math achieve- 
ment right before the SIG intervention started. Notably, the pre-SIG trends 
are almost parallel in these two types of schools. After fall 2010, in obvious 
contrast to the pre-SIG trend, the mean math achievement rose much more
quickly in SIG schools than in non-SIG schools. By spring 2013, the third
intervention year, the gap in average math achievement declined to 0.50
SD (i.e., 0.08 − [−0.42]). Figure 1b shows analogous results for ELA.

Figure 1. Comparison of trends in student achievement between SIG and non-SIG
schools for “all starters.”
Note. The “all starters” sample includes students who were in the district in fall 2010,
regardless of whether they transferred between schools in subsequent years.
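
The gap arithmetic for math can be checked directly from the group means quoted above for spring 2010 and spring 2013. The short sketch below also computes the raw, unadjusted difference-in-differences those four means imply; this is a descriptive quantity only and is not expected to match the regression-adjusted estimates reported in Table 2.

```python
# Group means for math quoted in the text (in SD units).
sig_2010, sig_2013 = -0.69, -0.42
non_2010, non_2013 = 0.11, 0.08

gap_2010 = non_2010 - sig_2010   # gap right before the intervention
gap_2013 = non_2013 - sig_2013   # gap in the third intervention year
raw_dd = (sig_2013 - sig_2010) - (non_2013 - non_2010)

print(round(gap_2010, 2), round(gap_2013, 2), round(raw_dd, 2))
```

The gap narrows from 0.80 SD to 0.50 SD, which corresponds to a raw difference-in-differences of 0.30 SD: SIG schools gained 0.27 SD while non-SIG schools slipped by 0.03 SD.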

Table 2 presents regression estimates that formalize the patterns we 
observe in Figure 1. For each subject area, we have two model specifica- 
tions: one which includes a control for average student achievement 
observed prior to the SIG reforms (Model 1), and a second which includes 
lagged student achievement from the prior year (Model 2). As indicated in 
the columns of “all SIG schools” in Model 1, the estimated SIG effect in 
math is 0.12 SD in 2011 and 0.07 SD in ELA. The Year 2 point estimates 
are higher in both math and ELA. In Year 3, the estimates are positive and 
significant: relative to the change in non-SIG schools, we estimate that SIG 
interventions improved student achievement by 0.24 SD in math with con- 
trols for average achievement prior to SIG and generated an average year- 
to-year improvement of 0.15 SD. We estimate that the SIG interventions sig- 
nificantly increased average ELA achievement by 0.12 SD in Year 3 and gen- 
erated year-to-year improvements of 0.02 SD. Results for the “stayers” 
sample in online appendix Table B3 show consistent, somewhat larger pos- 
itive effects of SIG interventions in both math and ELA in Year 3. 

Although transformation and turnaround schools adopted many similar 
interventions, turnaround schools also replaced leaders and staff, potentially 
resulting in different treatment effects. We use Equation 2 to estimate SIG 
effects on student achievement in these two types of schools. Our analyses 
by reform type may be more exploratory than causal, because as illustrated 
in Figure B2 in the online appendix, the common trends assumption may not
hold for some model specifications. However, it is still worthwhile to present
the analyses, because prior literature suggested differential effects between
these two types of reform models (Dee, 2012). As shown in the
“Transformation” and “Turnaround” columns in Table 2, consistent across
both reform models and in both subjects, SIG effects in the third year are
generally larger than in the first 2 years of the intervention. Additionally,
between-group comparisons suggest larger increases in mean math achievement in
turnaround than in transformation schools across all 3 years. For example, for
the “all starters” sample in 2011, the estimated effect on average improvement
controlling for average scores prior to the reform is 0.09 SD in transformation
schools, which is smaller than the estimated 0.17 SD change in turnaround
schools (F = 3.67, p < 0.1). Similarly, in 2012, the estimated average effect
is 0.07 SD in transformation schools, compared with a much larger estimate
of 0.32 SD in turnaround schools (F = 3.45, p < 0.1). In 2013, transformation
schools had an estimated average effect of 0.16 SD and turnaround schools had
an estimated effect of 0.47 SD (F = 3.34). Although none of the differences in
the estimated effects on ELA between these two types of schools are
statistically significant, the estimated effects in turnaround schools are still slightly
larger than those in transformation schools.

Figure 2. Trends in full-day student absences in both SIG and non-SIG schools
for “all starters.”
Note. The “all starters” sample includes students who were in the district in fall 2010,
regardless of whether they transferred between schools in subsequent years.


Absences 


Our analyses provide some evidence of changes in student attendance 
in response to increased monitoring of student progress under SIGs, but 
not strong evidence of effects. Figure 2 illustrates the changes in average
full-day excused and unexcused absences from 2008 to 2013, separately for
SIG and non-SIG schools. Figure 2a shows the average days of excused 
absences in the “all starters” sample and shows that SIG reforms did not 
lead to meaningful decreases in excused absences, because the gap in aver- 
age days of excused absences between SIG and non-SIG schools did not 
change meaningfully from the pre- to postintervention periods. Figure 2b 
shows the average days of unexcused absences in the “all starters” sample 
and indicates decreases in both SIG and non-SIG schools after program 
implementation. Figure 2 suggests that SIG reforms may have reduced unex- 
cused absences, as the gap between SIG and non-SIG schools shrank a little 
in the postintervention period, but the causal effects are not as clear as for 
achievement because the differences between SIG and non-SIG schools 
were closing in the years before reform as well. 

The regression estimates in Table 3 confirm that SIG reforms had close-to- 
zero influence on students’ excused absences but did reduce the likelihood 
that students had unexcused absences. For example, the incidence rate for 
full-day unexcused absences decreased by 18% in Year 1 in the “all starters” 
sample, by 24% in Year 2, and by 12% in Year 3. Although only Year 2 SIG 
effect estimates are consistently significant across model specifications, all esti- 
mates for unexcused absences are negative. These discrepant findings across 
the two types of absences are understandable given that unexcused absences 
are more likely to be malleable and a function of factors such as parental 
engagement and a student attendance monitoring system. 
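
For readers who want to move between the percent changes quoted above and the scale of a count-model coefficient, the sketch below applies the usual incidence-rate-ratio conversion. It assumes the reported percentages are the standard transformation (exp(b) − 1) × 100 of the negative binomial coefficients; the text does not spell this out, and the function names are hypothetical.

```python
import math

def pct_change_to_coef(pct_change):
    """Coefficient implied by a percent change in the incidence rate."""
    return math.log(1 + pct_change / 100)

def coef_to_pct_change(coef):
    """Percent change in the incidence rate implied by a coefficient."""
    return (math.exp(coef) - 1) * 100

# Percent changes in unexcused absences quoted in the text for Years 1 to 3.
for year, pct in [(1, -18), (2, -24), (3, -12)]:
    coef = pct_change_to_coef(pct)
    # Print the year, the incidence rate ratio, and the implied coefficient.
    print(year, round(math.exp(coef), 2), round(coef, 3))
```

An 18% decrease, for example, corresponds to an incidence rate ratio of 0.82, so the implied coefficient is the natural log of 0.82.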

When we compare transformation with turnaround schools, the results in 
Table 3 do not show systematic differences between these two types of schools 
in estimated intervention effects on either excused or unexcused absences. 


Family Preferences 


Among all students who submitted school preferences, roughly 35% 
applied for kindergarten, 18% applied for sixth grade, 31% applied for ninth 
grade, and 1%-2% applied for each of the remaining grade levels. Figure 3a 
plots the percentage of all students who listed a SIG school as their first 
choice among those submitting choice preferences. The trend for SIG 
schools’ popularity among families declined from 2005 to 2010, while the 
negative trend reversed after 2011. This pattern also emerges in logit regres- 
sion results in the first column of Table 4. Among all students submitting 
choices in Year 1, the odds that students selected a SIG school as their first 
choice significantly increased by 31% relative to the odds of making the 
same choice before the intervention, after accounting for student character- 
istics, distance from their home to the school, and school fixed effects. The 
odds that students listed a SIG school as their first choice increased by 65% in 
Year 2 and 117% in Year 3.

Figure 3. Percentage of students listing a SIG school as their first choice.


Changes in the popularity of SIG schools varied by subgroup. Figures 3b 
and 3c show increasing trends during the intervention period among White 
and African American students in particular. In Year 3, the odds that both
African American and White students chose SIG schools were about twice as
large as in the pre-SIG period. In contrast, Hispanic students became less 
likely to choose SIG schools in Year 3 and Asian students’ choices were 
not influenced by SIG designations. 

Figure 3f shows that high-achieving students (i.e., those scoring in the 
top 50% of the distribution of the prior-year average math and ELA scores 
in the district) became increasingly more likely to choose SIG schools during 
the reform period, as did students with at least one parent with a bachelor’s 
degree (Figure 3h). Specifically, the odds for high-achieving students listing 
a SIG school as their first choice significantly increased by 82% in Year 2, rel- 
ative to the odds that this group listed a SIG school first before the interven- 
tion. The odds that students from highly educated families ranked SIG 
schools as their first choice grew, on average, by 98% in Year 3. 

We find no significant differences in desirability in the first 2 years 
between transformation and turnaround schools, as shown in the last three 
columns in Table 4. 
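
Because the choice results are reported as percent changes in odds rather than in probabilities, a small conversion can help gauge their practical size. The sketch below applies the overall odds increases quoted above (31%, 65%, and 117%) to a baseline share. The 5% baseline is hypothetical, chosen only for illustration, and is not a figure from the study.

```python
def shift_probability(baseline_p, pct_change_in_odds):
    """Probability after multiplying the baseline odds by (1 + pct/100)."""
    odds = baseline_p / (1 - baseline_p)
    new_odds = odds * (1 + pct_change_in_odds / 100)
    return new_odds / (1 + new_odds)

# Hypothetical baseline: 5% of applicants list a SIG school first.
baseline = 0.05
for year, pct in [(1, 31), (2, 65), (3, 117)]:
    print(year, round(shift_probability(baseline, pct), 3))
```

When the baseline share is small, a percent change in odds is close to the same percent change in probability; the two diverge as the baseline grows.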


Teacher Retention 


A key piece of the SIG reforms involved staff reconstitution to improve
staff effectiveness. As shown in Table 5, the estimated SIG effects on retaining
math teachers with higher value-added are positive in all intervention years 
and statistically significant in both Years 1 and 3. With a 1 SD increase in 
a typical teacher’s value-added, the odds that this teacher remains in a SIG 
school significantly increased by 2.68 times in Year 1 and 1.78 times in 
Year 3, relative to the odds for similarly effective counterparts in non-SIG 
schools compared with the prereform years. We observe similar positive 
SIG effects for ELA teachers in these 3 years, with a particularly large effect 
in Year 2. Taken together, the results in both subjects provide evidence that 
the SIG schools were able to retain more effective teachers in the reform 
years than they had been able to do in prior years. 

Replacing effectiveness with teacher experience in Equation 4, we 
observe the opposite pattern. The odds that an experienced teacher stayed 
in a SIG school declined by 87% in Year 1, compared with the odds of turn- 
over for experienced teachers in non-SIG schools, after accounting for 
teacher value-added and other controls. This declining trend continued in 
Years 2 and 3. These findings indicate that during the reform, SIG schools 
became more likely to retain teachers based on their effectiveness and less 
likely to retain teachers based on seniority. 


Teacher Supports 


We document the ways in which SIG schools improved teacher capacity 
and instructional leadership by examining teacher reports of support for 
teaching. As shown in the last column in Table 5, there was no significant
difference in teacher-reported support after Year 1 of the SIG award.
However, by the second year, SIG teachers reported a level of teacher sup- 
port that was 0.50 points higher than the level reported in non-SIG schools 
(an increase of 0.26 SD). By the spring of 2013, this difference was 0.81 
points higher (an increase of 0.41 SD). 


Robustness and Falsification Analysis 


The key assumption of the analytic approach used to analyze student 
achievement and absences is that the changes from pre- to postintervention 
periods in non-SIG schools provide a valid counterfactual for what would 
have happened in SIG schools if the interventions had not been implemented. 
Although we cannot prove this assumption, we closely examine the pre-SIG 
trends to assess a possible violation. As shown in Figure 1, the pre-SIG trends 
in achievement measures were almost parallel between SIG and non-SIG 
schools, suggesting common pre-SIG trends.¹⁴ As shown in Figure 3 for families’
school choices, the general trend was consistently decreasing in the prereform 
period and then sharply increased after the reform started in 2011. There is no 
significant sign of discontinuity in the pre-SIG trend. Only for absences do we 
find some cause for concern. Figure B3 shows parallel trends for the “stayers” 
sample. Figure 2 shows some closing of the gap in unexcused absences 
between SIG and non-SIG schools in the “all starters” sample prior to the 
reforms. We statistically test this threat to the common trends assumption for 
absences by adding pretreatment, school-specific trends to Equation 1. 
Results are included in the Model 2 values in Table 3. The estimated SIG effects 
are largely consistent with our main models—Model 1 values in Table 3. 

A second threat to the internal validity is the plausibility of other concur- 
rent events. That is, SIG effects could be invalidated if there were unob- 
served determinants of our outcome measures that varied both 
contemporaneously with the onset of SIG interventions and uniquely 
occurred in SIG schools. One such plausible event would be the Quality 
Teacher and Education Act (QTEA) in June 2008, which authorized SFUSD 
to collect $198 per parcel of taxable property annually for 20 years to 
fund a general increase in teacher salaries and support for school improve- 
ment initiatives. A vast majority of the funds were applied to all schools, 
except for 5% of the funds, which were used to provide $2,000 for teachers 
working in designated hard-to-staff schools. Although hard-to-staff schools 
under QTEA change over time, some of them also receive SIG awards. If 
there were no systematic differences in changes in student outcomes 
between SIG schools and other QTEA hard-to-staff schools, we would sus- 
pect that the observed SIG effects might be part of the QTEA effects, rather 
than due to SIG interventions. We compare SIG with non-SIG QTEA schools 
using Equation 1 and include the results in Panel A of Tables B9 and B10 in 
the online appendix. Alternatively, in Panel B of both tables, we exclude all
QTEA schools from the analysis and compare non-QTEA SIG schools with
non-QTEA non-SIG schools. In both analyses, SIG schools experienced 
larger math and ELA achievement gains than non-SIG schools, greater reduc- 
tions in excused and unexcused absences, and increased popularity among 
parents. These findings provide evidence that QTEA is not a major threat to 
the inferences of identified SIG effects. 

A final alternative explanation for the positive trend would be that the 
gains made by SIG schools were largely due to mean reversion. Mean rever- 
sion describes the phenomenon that the lowest achieving schools were 
likely to experience larger than average gains in subsequent years (Figlio 
& Rouse, 2006). Were this true, the large test-score gains in SIG schools 
would not be the result of SIG interventions but rather would have occurred 
anyway because the lowest performing schools are likely to improve. To test 
this possibility, we created 10 pseudo-SIG schools using schools’ average 
proficiency in both math and ELA from 2005 to 2007—the 3 years before 
the school performance data were used for identifying the actual SIG eligible 
schools. These 10 pseudo-SIG schools were the lowest performing schools 
during that time interval and did not have a net gain of 50 points or more 
on Academic Performance Index (API) scores from 2004 to 2007, nor did they
meet the statewide goals of 800 API in 2006-2007. In other words, the iden- 
tification of pseudo-SIG schools mimics the criteria used to identify SIG- 
eligible schools. To mimic the main analysis for the actual SIG schools, we 
created “pseudo” “stayers” and “all starters” samples. 

If mean-reversion errors explain the test score gains following the SIG 
reform, then one should also observe such an increase for pseudo-SIG schools 
from 2008 to 2010—the 3 pseudo years of intervention. The results are pre- 
sented in Panel B of Tables B11 (achievement) and B12 (absences and school 
choices) in the online appendix. The estimated pseudo-SIG effects are either 
in the opposite direction of the actual estimates of SIG effects or statistically 
insignificant. This suggests that mean reversion is not the explanation for 
the identified gains in student achievement and desirability or the reduction 
in unexcused absences in actual SIG schools during SIG reform years. 
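To illustrate what this placebo check guards against, the following minimal sketch simulates schools whose true quality never changes and flags the lowest scorers on a noisy baseline. It is not the authors' analysis: the data are simulated, and the function name `placebo_did`, the noise levels, and the 100-school setup are assumptions chosen only for illustration. With no intervention at all, the flagged schools still tend to show a positive difference-in-differences "effect," which is the pattern mean reversion would produce and which the actual pseudo-SIG estimates do not show.

```python
import random
from statistics import mean


def placebo_did(n_schools=100, n_flagged=10, n_years=3, rng=None):
    """Difference-in-differences for schools flagged on a noisy baseline.

    Hypothetical setup: each school has a fixed true quality, and every
    yearly score adds independent noise. No school is treated.
    """
    if rng is None:
        rng = random.Random(0)
    quality = [rng.gauss(0, 1) for _ in range(n_schools)]
    # Average score over the baseline years and over the later "pseudo" years.
    pre = [mean(q + rng.gauss(0, 1) for _ in range(n_years)) for q in quality]
    post = [mean(q + rng.gauss(0, 1) for _ in range(n_years)) for q in quality]
    # Flag the lowest performers on the baseline average.
    order = sorted(range(n_schools), key=lambda i: pre[i])
    flagged, rest = order[:n_flagged], order[n_flagged:]
    gain_flagged = mean(post[i] - pre[i] for i in flagged)
    gain_rest = mean(post[i] - pre[i] for i in rest)
    return gain_flagged - gain_rest


rng = random.Random(42)
estimates = [placebo_did(rng=rng) for _ in range(200)]
avg_placebo = mean(estimates)
print(f"average placebo DiD under pure mean reversion: {avg_placebo:.2f}")
```

A clearly positive average here comes entirely from selecting on noisy scores, so a null or negative placebo estimate in real data is evidence against that mechanism.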

We tested the degree to which our estimated SIG effects are robust to 
several other mechanisms and model specifications. In our sample, 6.6% 
of students repeated a grade. Controlling for grade repeaters did not change 
our results (Table B14 in the online appendix). Another alternate specifica- 
tion uses student fixed effects instead of controlling for students’ prior char- 
acteristics and achievement (Table B15), and results are generally consistent 
with our main estimates in Table 2. Last, we instrumented the actual number 
of years enrolled in SIG schools with an intent-to-treat measure: the number 
of years students would be expected to spend in SIG schools based on their 
initial enrollment in fall 2010. The estimated SIG effects, as shown in 
Table B17, remain positive and significant. 
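
The logic of this instrumental-variable check can be sketched with simulated data. The example below is a simplified illustration, not the authors' model: it omits covariates and fixed effects, and the variable names, the 0.10 SD per-year effect, and the rule that students with a low unobserved factor leave a year early are all assumptions. With a single instrument and no covariates, the IV estimate reduces to a ratio of covariances.

```python
import random
from statistics import mean


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


rng = random.Random(7)
TRUE_EFFECT = 0.10  # hypothetical per-year effect, in SD units

expected_years, actual_years, scores = [], [], []
for _ in range(20000):
    u = rng.gauss(0, 1)        # unobserved student factor (assumed)
    z = rng.randint(0, 3)      # years expected from initial enrollment
    # Assumed attrition rule: students with low u leave one year early,
    # so actual exposure is correlated with the unobserved factor.
    x = max(z - 1, 0) if u < -0.5 else z
    y = TRUE_EFFECT * x + u + rng.gauss(0, 1)
    expected_years.append(z)
    actual_years.append(x)
    scores.append(y)

# Naive regression of scores on actual years (biased by selective exit).
ols = cov(actual_years, scores) / cov(actual_years, actual_years)
# IV estimate: expected years instrument for actual years.
iv = cov(expected_years, scores) / cov(expected_years, actual_years)
print(f"OLS: {ols:.3f}  IV: {iv:.3f}  truth: {TRUE_EFFECT:.3f}")
```

Because expected years depend only on initial enrollment, they are unrelated to later transfer decisions, which is why the IV estimate recovers the assumed effect in this simulation while the naive estimate does not.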

School Improvement Grants highlight a national focus on improving 
lowest-performing schools through competitive incentives and highly pre- 
scriptive school reform frameworks. Since the SIG program began in 2009, 
more than 1,500 schools across the country have undertaken one of four 
interventions that require schools to institute specific changes aimed at rais- 
ing student outcomes. Although using comprehensive strategies to transform 
persistently lowest performing schools is not new, the scope of the SIG pro- 
gram and its highly prescriptive models distinguish it from earlier reforms. 
Rigorous evidence on SIG impacts on a variety of student outcomes and 
potential mechanisms of change can shed light on the next wave of school 
improvement efforts under the Every Student Succeeds Act (ESSA), which 
continues to use research-based evidence and dramatic strategies to trans- 
form low-performing schools. 

This study provides new evidence based on unique longitudinal data 
from SFUSD and includes a richer set of measures on both outcomes and 
mechanisms than those examined by earlier evaluations of SIGs and com- 
prehensive school reforms, more generally. We find that SIG reforms in 
SFUSD resulted in gradual improvements in the first 2 years, and significant 
positive changes on several measures of school performance by the third 
year of the grants. Specifically, SIG reforms narrowed the achievement 
gap between these lowest performing schools and the rest of the schools 
in the district from 0.80 SD in spring 2010 (right before the reform) to 0.50 
SD in the third year of SIG. Equally important, SIG reforms reduced the 
odds of unexcused absences by 24% in Year 2 and improved school desir- 
ability among families, indicated by an increase in the odds of being families’ 
first choice by 117% in Year 3 relative to pre-SIG years. These positive effects 
that emerge during the course of the intervention are robust to a variety of 
alternative explanations, such as student attrition, concurrent policies, and 
mean reversion. 
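
The magnitudes reported above can be restated with simple arithmetic. The sketch below uses only the figures in the preceding paragraph; the helper name is hypothetical, and treating the percent changes as exponentiated model coefficients is an assumption about how they were derived, not something taken from the estimation output.

```python
import math


def odds_ratio_from_percent_change(pct):
    """Convert a percent change in odds to an odds ratio (e.g., -24 -> 0.76)."""
    return 1 + pct / 100


# Achievement gap between SIG schools and other district schools, in SD units.
gap_before, gap_after = 0.80, 0.50
gap_reduction_sd = gap_before - gap_after          # absolute narrowing
gap_reduction_share = gap_reduction_sd / gap_before  # share of the initial gap

# Percent changes in odds reported for Year 2 absences and Year 3 first choice.
or_absence = odds_ratio_from_percent_change(-24)
or_first_choice = odds_ratio_from_percent_change(117)

# Implied coefficients on the log scale (assumption: odds = exp(coefficient)).
coef_absence = math.log(or_absence)
coef_first_choice = math.log(or_first_choice)

print(f"gap narrowed by {gap_reduction_sd:.2f} SD "
      f"({gap_reduction_share:.1%} of the initial gap)")
print(f"odds ratios: absences {or_absence:.2f}, first choice {or_first_choice:.2f}")
```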

Several findings are consistent with prior studies on the SIG program 
and comprehensive school reform. The positive effects on student achievement 
mirror Dee’s (2012) and Papay’s (2015) findings from other evaluations 
of SIG reforms. The larger positive effects in the third year relative to the first 
year echo earlier findings that comprehensive programs take time to yield 
impact (G. D. Borman et al., 2003; Bryk et al., 2010; de la Torre et al., 
2013). We find some evidence that the impacts in turnaround schools 
were more pronounced than those in transformation schools on raising 
year-to-year achievement and increasing popularity among families. Dee 
(2012) provides similar evidence of greater achievement gains in turnaround 
than transformation schools across California, and Dragoset et al. (2017) 
identify a more pronounced improvement in turnaround schools in second- 
ary grades. Moreover, Ahn and Vigdor (2014) find larger improvements 
among schools that underwent similar restructuring processes under NCLB. 
Recent work by Strunk and colleagues shows that the use of dramatic turn- 
around methods (reconstitution and restart), as opposed to softer reform 
methods (transformation), produced larger positive improvement on student 
achievement (Strunk, Marsh, Hashim, Bush-Mecenas, & Weinstein, 2016). 

Our study adds new evidence to the literature on SIGs and whole-school 
reform. Ours is one of the few large-scale studies to examine the SIG effects 
as they unfold during all 3 years of reform. Different from prior studies that 
included schools that adapt reform strategies with a wide range of rigor 
(Dragoset et al., 2017), the SFUSD’s reform plan was closely based on 
Bryk and colleagues’ (2010) research-based guidelines for successful school 
improvement. The overall positive findings in this study illustrate a case that 
uses rigorous evidence to inform the development of a theory of change and 
its implementation. Moreover, we employ multiple outcome measures that 
examine specific elements of SFUSD’s theory of action. Besides document- 
ing increased student achievement, reduced unexcused absences, and 
increased popularity among parents, we provide evidence on educator 
capacity building. SIG schools became more likely to retain more effective 
teachers and improved teacher-reported professional support. The evidence 
on the multidimensional SIG reforms in SFUSD shows that comprehensive 
school transformation can succeed in a complex system. 

Despite its novel contributions, this study has several caveats. The 
SIG interventions include two major components: evidence-based interven- 
tions and substantial financial investment. Our data cannot disentangle the 
program effect from the financial effect. Although it is desirable to know 
which components of the SIG interventions are most likely to contribute 
to the positive outcomes, we cannot separate the unique contribution of 
each component, because given the nature of the whole-school reform, all 
components are mingled together and implemented concurrently. 
Additionally, our data are drawn from one school district. Although it is 
demographically heterogeneous, it may represent a unique case where 
schools carried out a successful implementation of an evidence-based 
reform. We cannot be certain about the generalizability of these positive 
impacts on SIG programs elsewhere. However, it is worth emphasizing 
that this successful case sheds light on the promises of transforming persis- 
tently low-achieving schools and closing achievement gaps between 
schools. 

Despite the caveats, the findings of this study have timely policy impli- 
cations. ESSA continues to prioritize turning around persistently low- 
performing schools on the nation’s education reform agenda. As opposed 
to interventions driven by federal mandate, ESSA gives states and districts 
much more flexibility in which actions they take to support struggling 
schools (Sun, Saultz, & Ye, 2016). It is then all the more important to provide 
states and districts with guidance for choosing and implementing effective 
reforms. The positive impacts of SIG reform in SFUSD add to the growing evidence 
in support of school transformation guided by evidence-based frameworks. 
Last, because comprehensive school reforms take time to implement, an 
important design feature to underscore is the gradual emergence and inten- 
sification of reform impacts, suggesting that such efforts should be given 
time to come to fruition. 

1Based on the Academic Performance Index (API) and proficiency rate from state 
standardized tests. 

2This school closed at the end of 2011, 1 year after receiving $50,000 to support a 
parent-community outreach coordinator to assist students in transitioning to new schools. 

3We conduct further robustness checks to see whether SIGs have a larger or smaller 
impact on students who later transferred than students who stayed. If SIGs have had a sig- 
nificantly larger (or smaller) effect on students who transferred, the estimates from the 
“stayers” sample in Year 2 and 3 would have been underestimated (or overestimated). 
To understand the impact of excluding students who transferred, similar to a conventional 
difference-in-differences-in-differences (DDD) framework, we included interaction terms 
(e.g., treatment year 1 indicator*SIG school indicator*indicator of whether the student 
later transferred out; treatment year 2 indicator*SIG school indicator*indicator of 
whether the student later transferred out; treatment year 3 indicator*SIG school indica- 
tor*indicator of whether the student later transferred out) to Equation 1, as well as rele- 
vant two-way interactions. Results are included in online appendix OB-18. While there 
seems to be some evidence of differential effects in Model 1, after controlling for students’ 
characteristics and performance prior to the SIG reform in Model 2 and Model 3, we do 
not see any differential effects of SIGs between students who stayed and those who later 
transferred. The SIG effect estimates remain positive and significant for math in Year 3. 

4The choice of comparison group may also influence the SIG effect estimates. To examine 
the degree to which our results are sensitive to the choice of comparison groups, we used 
the state criteria for defining SIG-eligible schools to choose an alternative plausible compar- 
ison group. The state of California published a list of all eligible schools. For each school, the 
state also published eligibility measures, including their Academic Performance Index (API) scores 
in the prior 3 years, graduation rates in each of the prior 5 years, tier level, and so on. Using 
the list and data on the state-identified eligible schools, we constructed one plausible comparison 
group that includes 19 SFUSD schools that were similar to these nine SIG awardee 
schools in terms of API or graduation rates. As indicated in online appendix Table B13, the 
results are very much consistent with the SIG effect estimates in the main analysis in that 
SIG effects are largely positive and are particularly large in Year 3 of the intervention. 

5We also specified comparative interrupted time series models to estimate both level 
and trajectory changes. The results in online Table B5 indicate positive level change in SIG 
schools in some model specifications and consistently positive effects on the slope (i.e., 
trajectory) change, although only a few are statistically significant. Our data do not have 
enough power to simultaneously estimate several school-level treatment parameters. 

6Starting in Grade 8, students in the same grade started to take different math courses 
and math examinations, such as Algebra I, Algebra II, or Geometry. To account for this, we 
include dummy indicators for the types of math examinations that a student took in 
Equation 1. When modeling math achievement in Grade 8 and above, we control for stu- 
dents’ prior test scores in seventh grade when all students took the same examination. 
Coefficient estimates are consistent with those calculated from Equation 1 at two decimal 
places, and statistical inferences are the same. Results are available upon request from the 
authors. In ELA, in contrast, students take a grade-specific examination in all tested grade 
levels, regardless of course content. 

7Cluster-robust standard errors account for correlations among observations within 
clusters and are more conservative than ordinary least squares (OLS) standard errors 
when fewer clusters are present (Cameron & Miller, 2015, p. 340). We also used cluster 
bootstrapping with 400 replications, as recommended by Cameron and Miller when clus- 
tered fixed effects are included (Cameron & Miller, 2015, p. 331). These two procedures 
yield similar standard errors in almost all of our model specifications of student achieve- 
ment. We include bootstrapped estimates in the online appendix Tables B6 and B7. 
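
As a rough illustration of the bootstrap procedure mentioned in this note, the sketch below resamples whole clusters with replacement and re-estimates a slope 400 times. It assumes a pairs cluster bootstrap, which is one of several variants and may not be the one the authors used; the simulated schools, the function names, and the simple bivariate slope are hypothetical stand-ins for the full model.

```python
import random
from statistics import mean, stdev


def ols_slope(points):
    """Slope from a bivariate least-squares fit of y on x."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def cluster_bootstrap_se(clusters, n_reps=400, seed=0):
    """Pairs cluster bootstrap: resample clusters, keep their rows together."""
    rng = random.Random(seed)
    ids = list(clusters)
    draws = []
    for _ in range(n_reps):
        resampled = rng.choices(ids, k=len(ids))  # clusters drawn with replacement
        points = [pt for cid in resampled for pt in clusters[cid]]
        draws.append(ols_slope(points))
    return stdev(draws), draws


# Simulated data (assumption): 30 schools, 20 students each, with
# school-level shocks that correlate observations within a school.
rng = random.Random(1)
clusters = {}
for school in range(30):
    school_x, school_y = rng.gauss(0, 1), rng.gauss(0, 1)
    rows = []
    for _ in range(20):
        x = school_x + rng.gauss(0, 1)
        y = 0.5 * x + school_y + rng.gauss(0, 1)
        rows.append((x, y))
    clusters[school] = rows

se, draws = cluster_bootstrap_se(clusters)
print(f"bootstrap SE of the slope over {len(draws)} replications: {se:.3f}")
```

Resampling at the cluster level rather than the student level preserves the within-school correlation that the standard error is meant to reflect.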

8We used both Poisson and negative binomial models to fit the count data (days of 
student absences). After comparing the model fit, we concluded that the negative binomial 
models fit the count data better because negative binomial models account for the over- 
dispersion of the data. 
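
A quick way to see why overdispersion matters is to compare the variance of the counts with their mean, since a Poisson model assumes the two are equal. The sketch below is a simple descriptive diagnostic, not the model-fit comparison described in this note; the absence counts are made up for illustration, and a full check would condition on the model's covariates.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by the mean; values near 1 are Poisson-like."""
    return variance(counts) / mean(counts)


# Hypothetical days absent for 20 students: many zeros and a few large values.
absences = [0, 0, 1, 0, 2, 0, 15, 3, 0, 1, 0, 22, 4, 0, 1, 7, 0, 0, 2, 30]

ratio = dispersion_ratio(absences)
print(f"variance-to-mean ratio: {ratio:.1f}")
if ratio > 1:
    print("counts are overdispersed relative to a Poisson distribution")
```

A ratio well above 1 suggests that a negative binomial model, which adds a dispersion parameter, is likely to fit better than a Poisson model.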

9During the school years we examined, teachers used a paper Scantron to mark a student 
as absent or present in each class. For an absent student, a clerk in the school office 
would mark the student as excused absent if the clerk received a phone call from a parent 
or guardian providing reasons for absence; otherwise, the student was identified as unex- 
cused absent for that class. According to our interviews with several administrators in the 
district, attendance records may bias toward presence due to the funding of Average Daily 
Attendance (ADA), but this measurement error is on our dependent variable and should 
not bias our results. 

10We also model changes among families’ top five choices. The estimates of SIG 
effects on the top five choices are generally similar to those on the first-choice schools. 
Results are available upon request from the authors. 

11We also estimate our models using the last 3 years of value-added, the last 4 years, and 
as many years of prior value-added measures available for a teacher. The results are qualita- 
tively consistent with those presented below. Results are available upon request from the 
authors. Our primary analyses use the three most recent years of data, because we expect 
principals or district leaders to rely most heavily on recent data to make staffing decisions. 

12Across each survey year, well over 1,200 teachers responded, with response rates 
ranging from 36% to 54%. Notably, average 4-year response rates in SIG schools and 
non-SIG schools were very similar, at 39.4% and 39.3%, respectively, suggesting that differ- 
ences in survey participation would not drive differences in teachers’ average responses. 

13Exploratory factor analysis results indicated one underlying factor (eigenvalue = 
2.18) of teacher support and Cronbach’s α = 0.73-0.76 of all the items across all 4 years. 
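
For readers unfamiliar with the reliability statistic in this note, the following sketch computes Cronbach's alpha from its standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The survey responses are invented for illustration and the function name is hypothetical; the value it produces is unrelated to the 0.73-0.76 range reported for the actual teacher survey.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns.

    Each element of `items` holds one survey item's scores, with
    respondents in the same order across items.
    """
    k = len(items)
    totals = [sum(resp) for resp in zip(*items)]  # each respondent's total score
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))


# Hypothetical responses: 3 items answered by 6 teachers on a 1-5 scale.
responses = [
    [4, 3, 5, 2, 4, 1],
    [5, 3, 4, 2, 5, 2],
    [4, 2, 5, 3, 4, 1],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Alpha rises toward 1 as the items move together across respondents, which is why it is read as a measure of internal consistency.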

14We included a full set of year fixed effects and their interactions with SIG treatment, 
which accounts for possible differential pre-SIG trends between SIG and non-SIG schools. 
The estimates of SIG effects in this alternative model specification, as shown in online 
appendix Table B8, are very consistent with our main model specifications. 